The Enabling Power of Graph Coloring Algorithms in Automatic Differentiation and Parallel Processing
Combinatorial scientific computing (CSC) is founded
on the recognition of the enabling power of combinatorial algorithms
in scientific and engineering computation and in high-performance computing.
The domain of CSC extends beyond traditional scientific computing---the
three major branches of which are numerical linear algebra,
numerical solution of differential equations, and
numerical optimization---to include a range of emerging and
rapidly evolving computational and information science disciplines.
Orthogonally, CSC problems could also emanate from
infrastructural technologies for supporting high-performance computing.
Despite the apparent disparity in their origins,
CSC problems and scenarios are unified by the following common features:
(A) The overarching goal is often to make computation
efficient---by minimizing overall execution time, memory usage,
and/or storage space---or to facilitate knowledge discovery or analysis.
(B) Identifying the most accurate combinatorial abstractions that
help achieve this goal is usually a part of the challenge.
(C) The abstractions are often expressed, with advantage, as graph
or hypergraph problems.
(D) The identified combinatorial problems are typically NP-hard to
solve optimally. Thus, fast, often linear-time, approximation (or
heuristic) algorithms are the methods of choice.
(E) The combinatorial algorithms themselves often need to be
parallelized, to avoid their being bottlenecks within a larger
parallel computation.
(F) Implementing the algorithms and deploying them via software
toolkits is critical.
This talk attempts to illustrate the aforementioned features of CSC
through an example: we consider the enabling role graph coloring
models and their algorithms play in efficient computation of
sparse derivative matrices via automatic differentiation (AD).
The talk focuses on efforts being made on this topic within
the SciDAC Institute for Combinatorial Scientific Computing
and Petascale Simulations (CSCAPES).
Aiming to provide an overview rather than details, we discuss
the various coloring models used in sparse Jacobian and Hessian computation,
the serial and parallel algorithms developed in CSCAPES
for solving the coloring problems, and a
case study that demonstrates the efficacy of the coloring techniques
in the context of an optimization problem in a Simulated Moving Bed process.
Implementations of our serial algorithms for the coloring
and related problems in derivative computation are assembled
and made publicly available in a package called ColPack.
Implementations of our parallel coloring algorithms are
incorporated into and deployed via the load-balancing toolkit Zoltan.
ColPack has been interfaced with ADOL-C, an operator overloading-based
AD tool that has recently acquired improved capabilities for
automatic detection of sparsity patterns of Jacobians and Hessians
(sparsity pattern detection is the first step in derivative matrix
computation via coloring-based compression).
Further information on ColPack and Zoltan is available
at their respective websites, which can be accessed via
http://www.cscapes.or
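
The coloring-based compression mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not ColPack's or ADOL-C's API: the function names (`greedy_column_coloring`, `compress`, `recover`) and the tridiagonal test matrix are assumptions made for illustration. Columns that share no nonzero row receive the same color, so one compressed column per color suffices to recover every nonzero.

```python
def greedy_column_coloring(pattern):
    """Color columns so that columns sharing a nonzero row get different colors.

    pattern[i][j] is True where the Jacobian may be nonzero.
    """
    n_rows, n_cols = len(pattern), len(pattern[0])
    colors = []
    for j in range(n_cols):
        # Colors of earlier columns that overlap column j in at least one row.
        forbidden = set()
        for k in range(j):
            if any(pattern[i][j] and pattern[i][k] for i in range(n_rows)):
                forbidden.add(colors[k])
        c = 0
        while c in forbidden:
            c += 1
        colors.append(c)
    return colors


def compress(jacobian, colors):
    """Form B = J * S, where seed column c sums the Jacobian columns of color c.

    An AD tool would obtain B directly from directional derivatives; the
    explicit Jacobian is used here only to keep the sketch self-contained.
    """
    n_colors = max(colors) + 1
    compressed = [[0.0] * n_colors for _ in jacobian]
    for i, row in enumerate(jacobian):
        for j, value in enumerate(row):
            compressed[i][colors[j]] += value
    return compressed


def recover(compressed, pattern, colors):
    """Read each nonzero J[i][j] back from compressed[i][colors[j]]."""
    return [[compressed[i][colors[j]] if pattern[i][j] else 0.0
             for j in range(len(colors))]
            for i in range(len(pattern))]


# Hypothetical example: a 6 x 6 tridiagonal Jacobian.
n = 6
jacobian = [[(2.0 if i == j else -1.0) if abs(i - j) <= 1 else 0.0
             for j in range(n)] for i in range(n)]
pattern = [[value != 0.0 for value in row] for row in jacobian]

colors = greedy_column_coloring(pattern)
compressed = compress(jacobian, colors)
recovered = recover(compressed, pattern, colors)
```

For this tridiagonal pattern the greedy pass uses three colors, so three compressed columns stand in for six Jacobian columns, and `recovered` matches `jacobian` entry by entry.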
Parallel Maximum Clique Algorithms with Applications to Network Analysis and Storage
We propose a fast, parallel maximum clique algorithm for large sparse graphs
that is designed to exploit characteristics of social and information networks.
The method exhibits a roughly linear runtime scaling over real-world networks
ranging from 1000 to 100 million nodes. In a test on a social network with 1.8
billion edges, the algorithm finds the largest clique in about 20 minutes. Our
method employs a branch and bound strategy with novel and aggressive pruning
techniques. For instance, we use the core number of a vertex in combination
with a good heuristic clique finder to efficiently remove the vast majority of
the search space. In addition, we parallelize the exploration of the search
tree. During the search, processes immediately communicate changes to upper and
lower bounds on the size of maximum clique, which occasionally results in a
super-linear speedup because vertices with large search spaces can be pruned by
other processes. We apply the algorithm to two problems: to compute temporal
strong components and to compress graphs.
Comment: 11 page
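
The pruning idea in the abstract above (core numbers combined with a heuristic clique) can be sketched sequentially in Python. This is a simplified illustration under stated assumptions, not the authors' algorithm: the function names are hypothetical, the core-number routine is a plain quadratic version, and the parallel search with shared bounds is omitted. The key fact used is that every vertex of a clique of size s has core number at least s - 1.

```python
def core_numbers(adj):
    """Core number of each vertex, by repeatedly removing a minimum-degree vertex."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=lambda u: degree[u])
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def greedy_clique(adj, core):
    """Heuristic lower bound: grow cliques greedily from high-core vertices."""
    best = []
    for start in sorted(adj, key=lambda v: -core[v]):
        if core[start] + 1 <= len(best):
            break  # no remaining start vertex can belong to a larger clique
        clique, candidates = [start], set(adj[start])
        while candidates:
            v = max(candidates, key=lambda u: core[u])
            clique.append(v)
            candidates &= adj[v]
        if len(clique) > len(best):
            best = clique
    return best


def prune(adj, core, lower_bound):
    """Drop vertices whose core number rules out a clique larger than lower_bound."""
    keep = {v for v in adj if core[v] >= lower_bound}
    return {v: adj[v] & keep for v in keep}


def max_clique(adj):
    """Branch and bound on the pruned graph, seeded with the heuristic clique."""
    core = core_numbers(adj)
    best = greedy_clique(adj, core)
    sub = prune(adj, core, len(best))

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = list(clique)
        for v in sorted(candidates):
            if len(clique) + len(candidates) <= len(best):
                return  # bound: even taking every candidate cannot improve
            candidates = candidates - {v}
            expand(clique + [v], candidates & sub[v])

    expand([], set(sub))
    return best


def make_graph(edges):
    """Build an undirected adjacency-set dictionary from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


# Hypothetical example: a 4-clique {0,1,2,3} attached to a triangle {4,5,6} and a leaf 7.
example = make_graph([(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
                      (3, 4), (4, 5), (5, 6), (6, 4), (6, 7)])
result = max_clique(example)
```

In this example the heuristic already finds the 4-clique, and pruning with that bound removes every vertex, so the exact search has nothing left to explore.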
Characterization of Anaplasma marginale subsp. centrale strains by use of msp1aS genotyping reveals a wildlife reservoir
Bovine anaplasmosis caused by the intraerythrocytic rickettsial pathogen Anaplasma marginale is endemic in South Africa.
Anaplasma marginale subspecies centrale also infects cattle; however, it causes a milder form of anaplasmosis and is used as a
live vaccine against A. marginale. There has been less interest in the epidemiology of A. marginale subsp. centrale, and, as a result,
there are few reports detecting natural infections of this organism. When detected in cattle, it is often assumed that it is
due to vaccination, and in most cases, it is reported as coinfection with A. marginale without characterization of the strain. A
total of 380 blood samples from wild ruminant species and cattle collected from biobanks, national parks, and other regions of
South Africa were used in duplex real-time PCR assays to simultaneously detect A. marginale and A. marginale subsp. centrale.
PCR results indicated high occurrence of A. marginale subsp. centrale infections, ranging from 25 to 100% in national parks.
Samples positive for A. marginale subsp. centrale were further characterized using the msp1aS gene, a homolog of msp1 of A.
marginale, which contains repeats at the 5′ ends that are useful for genotyping strains. A total of 47 Msp1aS repeats were
identified, which corresponded to 32 A. marginale subsp. centrale genotypes detected in cattle, buffalo, and wildebeest.
RepeatAnalyzer was used to examine strain diversity. Our results demonstrate a diversity of A. marginale subsp. centrale strains
from cattle and wildlife hosts from South Africa and indicate the utility of msp1aS as a genotypic marker for A. marginale
subsp. centrale strain diversity.
http://jcm.asm.org; 2017-04-30; Veterinary Tropical Disease
Speeding up parallel graph coloring
Abstract. This paper presents new efficient parallel algorithms for finding approximate solutions to graph coloring problems. We consider an existing shared memory parallel graph coloring algorithm and suggest several enhancements, both in ordering the vertices so as to minimize cache misses and in assigning vertices to processors based on graph partitioning instead of random allocation. We report experimental results that demonstrate the performance of our algorithms on an IBM Regatta supercomputer using up to 12 processors. Our implementations use OpenMP for parallelization and Metis for graph partitioning. The experiments show up to a 70% reduction in runtime compared to the previous algorithm.
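
Why vertex-to-processor assignment matters can be illustrated with a small Python simulation. The abstract does not say which parallel algorithm it builds on, so the sketch below assumes a common speculative scheme: each simulated thread colors its own vertices against a snapshot of the other threads' colors, then conflicting vertices are detected and recolored in a later round. The function names and the ring graph are hypothetical, and contiguous blocks merely stand in for a partition with few cut edges.

```python
def speculative_coloring(adj, blocks):
    """Simulate round-based speculative coloring; return (colors, rounds)."""
    colors = {v: None for v in adj}
    owner = {v: b for b, block in enumerate(blocks) for v in block}
    pending = [list(block) for block in blocks]
    rounds = 0
    while any(pending):
        rounds += 1
        snapshot = dict(colors)  # what every simulated thread sees at round start
        for work in pending:     # each iteration stands in for one thread
            local = {}
            for v in work:
                # A thread sees its own new colors but only stale remote ones.
                forbidden = {local.get(u, snapshot[u]) for u in adj[v]}
                c = 0
                while c in forbidden:
                    c += 1
                local[v] = c
            colors.update(local)
        # Conflict detection: the larger endpoint of a clashing edge is redone.
        recolor = [[] for _ in blocks]
        for work in pending:
            for v in work:
                if any(u < v and colors[u] == colors[v] for u in adj[v]):
                    recolor[owner[v]].append(v)
        for block in recolor:
            for v in block:
                colors[v] = None
        pending = recolor
    return colors, rounds


def is_proper(adj, colors):
    """True if no edge joins two vertices of the same color."""
    return all(colors[u] != colors[v] for v in adj for u in adj[v])


# Hypothetical example: a ring of 8 vertices split between two simulated threads.
ring = {v: {(v - 1) % 8, (v + 1) % 8} for v in range(8)}
contiguous = [[0, 1, 2, 3], [4, 5, 6, 7]]   # few edges cross between blocks
interleaved = [[0, 2, 4, 6], [1, 3, 5, 7]]  # every edge crosses between blocks

colors_a, rounds_a = speculative_coloring(ring, contiguous)
colors_b, rounds_b = speculative_coloring(ring, interleaved)
```

Both assignments end with a proper coloring, but the interleaved one needs more rounds because every edge is a potential conflict, which is the effect a partition-based assignment aims to reduce.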
Graph Coloring in Optimization Revisited
We revisit the role of graph coloring in modeling problems that arise in efficient estimation of large sparse Jacobian and Hessian matrices using both finite difference (FD) and automatic differentiation (AD) techniques, in each case via direct methods. For Jacobian estimation using column partitioning, we propose a new coloring formulation based on a bipartite graph representation. This is compare